observation matrix
Reinforcement learning for graph theory, Parallelizing Wagner's approach
Our work applies reinforcement learning to construct counterexamples to conjectured bounds on the spectral radius of the Laplacian matrix of a graph. We expand upon the re-implementation of Wagner's approach by Stevanovic et al. with the ability to train numerous unique models simultaneously and a novel redefinition of the action space that adjusts the influence of the current local optimum on the learning process.
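For readers unfamiliar with the quantity being bounded, the sketch below computes the spectral radius of a graph Laplacian L = D - A by power iteration. It is a minimal illustration only, not the authors' code: the function names, the edge-list input format, and the seeded random start vector are all assumptions made for this example.

```python
import random


def laplacian(n, edges):
    """Build the Laplacian L = D - A of a simple undirected graph on n vertices."""
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
    return lap


def spectral_radius(lap, iters=500, seed=0):
    """Estimate the largest eigenvalue of a Laplacian by power iteration.

    The Laplacian is symmetric positive semidefinite, so its largest
    eigenvalue equals its spectral radius. The seeded random start vector
    is an assumption of this sketch, chosen so the start is unlikely to be
    orthogonal to the top eigenvector.
    """
    n = len(lap)
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        y = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(t * t for t in y) ** 0.5
        if norm == 0.0:
            return 0.0  # graph without edges: L is the zero matrix
        x = [t / norm for t in y]
    # Rayleigh quotient of the (unit-norm) iterate
    y = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * y[i] for i in range(n))


# Hypothetical inputs: the path on 3 vertices and the complete graph on 4.
path3 = laplacian(3, [(0, 1), (1, 2)])
k4 = laplacian(4, [(a, b) for a in range(4) for b in range(a + 1, 4)])
radius_path3 = spectral_radius(path3)
radius_k4 = spectral_radius(k4)
```

The Laplacian eigenvalues of the 3-vertex path are 0, 1, and 3, and those of the complete graph on 4 vertices are 0 and 4, so the two estimates should be close to 3 and 4 respectively.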
Dynamic Parameter Identification of a Curtain Wall Installation Robotic Arm
Xiao Liu, Yunxiao Cheng, Weijun Wang, Tianlun Huang, Wei Feng
In the construction industry, traditional methods fail to meet modern demands for efficiency and quality. Curtain wall installation is a critical component of construction projects. We design a hydraulically driven robotic arm for curtain wall installation and propose a dynamic parameter identification method for it. We establish a Denavit-Hartenberg (D-H) model based on measured robotic arm structural parameters and integrate hydraulic cylinder dynamics to construct a composite parametric system driven by a Stribeck friction model. By designing displacement excitation signals with a high signal-to-noise ratio for the hydraulic cylinders and using Fourier series to construct optimal excitation trajectories that satisfy joint constraints, the method effectively excites each parameter in the minimal parameter set of the robotic arm's dynamic model. On this basis, a hierarchical progressive parameter identification strategy is proposed: least squares estimation is employed to separately identify and jointly calibrate the dynamic parameters of both the hydraulic cylinder and the robotic arm, yielding Stribeck model curves for each joint. Experimental validation on a robotic arm platform demonstrates residual standard deviations below 0.4 Nm between theoretical and measured joint torques, confirming high-precision dynamic parameter identification for the hydraulically driven curtain wall installation robotic arm. This contributes to raising the intelligence level of curtain wall installation operations.
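The abstract does not state which form of the Stribeck friction model is used. As a hedged illustration, the sketch below evaluates one common textbook form, F(v) = sgn(v) [F_c + (F_s - F_c) exp(-(|v|/v_s)^delta)] + F_v v. The function name, parameter names, and the default exponent delta = 2 are assumptions for this example and are not taken from the paper.

```python
import math


def stribeck_friction(v, f_c, f_s, v_s, f_v, delta=2.0):
    """Friction force at velocity v under a common Stribeck parameterization.

    f_c: Coulomb friction level, f_s: static (breakaway) friction level,
    v_s: Stribeck velocity, f_v: viscous coefficient, delta: shape exponent.
    All names and the default delta are assumptions of this sketch.
    """
    if v == 0:
        return 0.0
    sign = 1.0 if v > 0 else -1.0
    # The exponential term decays from f_s toward f_c as |v| grows.
    dry = f_c + (f_s - f_c) * math.exp(-((abs(v) / v_s) ** delta))
    return sign * dry + f_v * v


# Hypothetical parameter values, chosen only to show the curve's shape.
low_speed = stribeck_friction(1e-9, f_c=1.0, f_s=2.0, v_s=0.1, f_v=0.5)
high_speed = stribeck_friction(100.0, f_c=1.0, f_s=2.0, v_s=0.1, f_v=0.5)
```

Near zero velocity the force approaches the static level f_s, and at high velocity it approaches the Coulomb level plus the viscous term. Sampling this function over a velocity range gives a curve of the kind the abstract refers to as a Stribeck model curve.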
Decentralized Mobile Target Tracking Using Consensus-Based Estimation with Nearly-Constant-Velocity Modeling
Amir Ahmad Ghods, Mohammadreza Doostmohammadian
Mobile target tracking is crucial in various applications such as surveillance and autonomous navigation. This study presents a decentralized tracking framework utilizing a Consensus-Based Estimation Filter (CBEF) integrated with the Nearly-Constant-Velocity (NCV) model to predict a moving target's state. The framework enables agents in a network to collaboratively estimate the target's position by sharing local observations and reaching consensus despite communication constraints and measurement noise. A saturation-based filtering technique is employed to enhance robustness by mitigating the impact of noisy sensor data. Simulation results demonstrate that the proposed method effectively reduces the Mean Squared Estimation Error (MSEE) over time, indicating improved estimation accuracy and reliability. The findings underscore the effectiveness of the CBEF in decentralized environments, highlighting its scalability and resilience in the presence of uncertainties.
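The ingredients named in this abstract can be sketched in one dimension as follows. This is a generic illustration rather than the CBEF itself, whose update equations are not given here: the NCV transition F = [[1, dt], [0, 1]] and the position-only observation matrix H = [1, 0] are standard, while the fixed gain, the clipping rule, and the plain neighbor averaging are simplifying assumptions.

```python
def ncv_predict(state, dt):
    """One noise-free step of the 1-D nearly-constant-velocity model.

    state = (position, velocity); the transition matrix is [[1, dt], [0, 1]].
    """
    pos, vel = state
    return (pos + dt * vel, vel)


def saturate(value, limit):
    """Clip value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def local_update(state, measurement, gain, limit):
    """Correct the position with a saturated innovation.

    Only position is observed (H = [1, 0]). The fixed scalar gain and the
    clipping of the innovation are assumptions of this sketch.
    """
    pos, vel = state
    innovation = saturate(measurement - pos, limit)
    return (pos + gain * innovation, vel)


def consensus_step(estimates, neighbors):
    """Each agent averages its estimate with those of its neighbors."""
    updated = []
    for i, est in enumerate(estimates):
        group = [est] + [estimates[j] for j in neighbors[i]]
        n = len(group)
        updated.append((sum(g[0] for g in group) / n,
                        sum(g[1] for g in group) / n))
    return updated


# Hypothetical two-agent network in which each agent hears the other.
predicted = ncv_predict((0.0, 2.0), dt=0.5)
corrected = local_update((0.0, 1.0), measurement=10.0, gain=0.5, limit=2.0)
agreed = consensus_step([(0.0, 0.0), (2.0, 2.0)], {0: [1], 1: [0]})
```

In this sketch the saturation bounds how far a single outlying measurement can move an agent's estimate, and the averaging step pulls neighboring agents' estimates toward each other.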
Learning Overcomplete HMMs
Vatsal Sharan, Sham M. Kakade, Percy S. Liang, Gregory Valiant
We study the problem of learning overcomplete HMMs--those that have many hidden states but a small output alphabet. Despite having significant practical importance, such HMMs are poorly understood with no known positive or negative results for efficient learning. In this paper, we present several new results--both positive and negative--which help define the boundaries between the tractable and intractable settings. Specifically, we show positive results for a large subclass of HMMs whose transition matrices are sparse, well-conditioned, and have small probability mass on short cycles. On the other hand, we show that learning is impossible given only a polynomial number of samples for HMMs with a small output alphabet and whose transition matrices are random regular graphs with large degree. We also discuss these results in the context of learning HMMs which can capture long-term dependencies.
The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough
Riccardo Zamboni, Duilio Cirino, Marcello Restelli, Mirco Mutti
The problem of pure exploration in Markov decision processes has been cast as maximizing the entropy over the state distribution induced by the agent's policy, an objective that has been extensively studied. However, little attention has been dedicated to state entropy maximization under partial observability, despite the latter being ubiquitous in applications, e.g., finance and robotics, in which the agent only receives noisy observations of the true state governing the system's dynamics. How can we address state entropy maximization in those domains? In this paper, we study the simple approach of maximizing the entropy over observations in place of true latent states. First, we provide lower and upper bounds on the approximation of the true state entropy that depend only on some properties of the observation function. Then, we show how knowledge of the latter can be exploited to compute a principled regularization of the observation entropy to improve performance. With this work, we provide both a flexible approach to bring advances in state entropy maximization to the POMDP setting and a theoretical characterization of its intrinsic limits.
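To make the relationship between state entropy and observation entropy concrete, the sketch below pushes a state distribution through an observation matrix and compares the two Shannon entropies. The 3-state distribution and the matrix entries are hypothetical values chosen for illustration, and the paper's bounds and regularizer are not reproduced here.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def observation_distribution(state_dist, obs_matrix):
    """Distribution over observations induced by a state distribution.

    obs_matrix[s][o] is the probability of emitting observation o from
    latent state s, so each row sums to 1.
    """
    n_obs = len(obs_matrix[0])
    return [
        sum(state_dist[s] * obs_matrix[s][o] for s in range(len(state_dist)))
        for o in range(n_obs)
    ]


# Hypothetical example: a noisy but informative sensor over three states.
state_dist = [0.5, 0.3, 0.2]
obs_matrix = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
obs_dist = observation_distribution(state_dist, obs_matrix)
state_entropy = entropy(state_dist)
obs_entropy = entropy(obs_dist)
entropy_gap = obs_entropy - state_entropy
```

With a noiseless sensor (an identity observation matrix) the two entropies coincide. With the noisy matrix above, the observation distribution is flatter than the state distribution, so maximizing observation entropy is only a proxy for maximizing state entropy.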
Online Multi-IMU Calibration Using Visual-Inertial Odometry
Jacob Hartzer, Srikanth Saripalli
This work presents a centralized multi-IMU filter framework with online intrinsic and extrinsic calibration for unsynchronized inertial measurement units that is robust against changes in calibration parameters. The novel EKF-based method estimates the positional and rotational offsets of the system of sensors as well as their intrinsic biases without the use of rigid body geometric constraints. Additionally, the filter is flexible in the total number of sensors used while leveraging the commonly used MSCKF framework for camera measurements. The filter framework has been validated using Monte Carlo simulation as well as experimentally. In both simulations and experiments, using multiple IMU measurement streams within the proposed filter framework outperforms the use of a single IMU in a filter prediction step while also producing consistent and accurate estimates of initial calibration errors. Compared to current state-of-the-art optimizers, the filter produces similar intrinsic and extrinsic calibration parameters for each sensor. Finally, an open source repository has been provided at https://github.com/unmannedlab/ekf-cal containing both the online estimator and the simulation used for testing and evaluation.